FIELD OF THE INVENTION

The invention relates to an integrated circuit having a plurality of processing
modules and a network arranged for coupling processing modules and a method for time slot
allocation in such an integrated circuit, and a data processing system.

BACKGROUND OF THE INVENTION

Systems on silicon show a continuous increase in complexity due to the ever
increasing need for implementing new features and improvements of existing functions. This
is enabled by the increasing density with which components can be integrated on an
integrated circuit. At the same time the clock speed at which circuits are operated tends to
increase too. The higher clock speed in combination with the increased density of
components has reduced the area which can operate synchronously within the same clock
domain. This has created the need for a modular approach. According to such an approach
the processing system comprises a plurality of relatively independent, complex modules. In
conventional processing systems the system modules usually communicate with each other via
a bus. As the number of modules increases, however, this way of communication is no longer
practical for the following reasons. On the one hand the large number of modules forms a too
high bus load, and on the other hand the bus constitutes a communication bottleneck as it
enables only one device to send data to the bus.

A communication network forms an effective way to overcome these
disadvantages. Networks on chip (NoC) have received considerable attention recently as a
solution to the interconnect problem in highly-complex chips. The reason is twofold. First,
NoCs help resolve the electrical problems in new deep-submicron technologies, as they
structure and manage global wires. At the same time they share wires, lowering their number
and increasing their utilization. NoCs can also be energy efficient and reliable and are
scalable compared to buses. Second, NoCs also decouple computation from communication,
which is essential in managing the design of billion-transistor chips. NoCs achieve this
decoupling because they are traditionally designed using protocol stacks, which provide
well-defined interfaces separating communication service usage from service implementation.

Introducing networks as on-chip interconnects radically changes the
communication when compared to direct interconnects, such as buses or switches. This is
because of the multi-hop nature of a network, where communication modules are not directly
connected, but are remotely separated by one or more network nodes. This is in contrast with
the prevalent existing interconnects (i.e., buses) where modules are directly connected. The
implications of this change reside in the arbitration (which must change from centralized to
distributed), and in the communication properties (e.g., ordering, or flow control), which
must be handled either by an intellectual property block (IP) or by the network.

Most of these topics have already been the subject of research in the field of
local and wide area networks (computer networks) and as an interconnect for parallel
machine interconnect networks. Both are very much related to on-chip networks, and many
of the results in those fields are also applicable on chip. However, NoC's premises are
different from off-chip networks, and, therefore, most of the network design choices must be
reevaluated. On-chip networks have different properties (e.g., tighter link synchronization)
and constraints (e.g., higher memory cost) leading to different design choices, which
ultimately affect the network services.

NoCs differ from off-chip networks mainly in their constraints and
synchronization. Typically, resource constraints are tighter on chip than off chip. Storage
(i.e., memory) and computation resources are relatively more expensive, whereas the number
of point-to-point links is larger on chip than off chip. Storage is expensive, because
general-purpose on-chip memory, such as RAMs, occupy a large area. Having the memory
distributed in the network components in relatively small sizes is even worse, as the overhead
area in the

Off-chip networks typically use packet switching and offer best-effort
services. Contention can occur at each network node, making latency guarantees very hard to
offer. Throughput guarantees can still be offered using schemes such as rate-based switching
or deadline-based packet switching, but with high buffering costs. An alternative to provide
such time-related guarantees is to use time-division multiple access (TDMA) circuits, where
every circuit is dedicated to a network connection. Circuits provide guarantees at a relatively
low memory and computation cost. Network resource utilization is increased when the
network architecture allows any left-over guaranteed bandwidth to be used by best-effort
communication.

A network on chip (NoC) typically consists of a plurality of routers and
network interfaces. Routers serve as network nodes and are used to transport data from a
source network interface to a destination network interface by routing data on a correct path
to the destination on a static basis (i.e., route is predetermined and does not change), or on a
dynamic basis (i.e., route can change depending e.g., on the NoC load to avoid hot spots).
Routers can also implement time guarantees (e.g., rate-based, deadline-based, or using
pipelined circuits in a TDMA fashion). More details on a router architecture can be found in
"A router architecture for networks on silicon", by Edwin Rijpkema, Kees Goossens, and Paul
Wielage, in PROGRESS, October 2001.

The network interfaces are connected to an IP block (intellectual property),
which may represent any kind of data processing unit or also be a memory, bridge, etc. In
particular, the network interfaces constitute a communication interface between the IP blocks
and the network. The interface is usually compatible with the existing bus interfaces.
Accordingly, the network interfaces are designed to handle data sequentialisation (fitting the
offered command, flags, address, and data on a fixed-width (e.g., 32 bits) signal group) and
packetization (adding the packet headers and trailers needed internally by the network). The
network interfaces may also implement packet scheduling, which can include timing
guarantees and admission control.

On-chip systems often require timing guarantees for their interconnect
communication. Therefore, a class of communication is provided, in which throughput,
latency and jitter are guaranteed, based on a notion of global time (i.e., a notion of
synchronicity between network components, i.e. routers and network interfaces), wherein the
basic time unit is called a slot or time slot. All network components usually comprise a slot
table of equal size for each output port of the network component, in which time slots are
reserved for different connections and the slot tables advance in synchronization (i.e., all are
in the same slot at the same time). The connections are used to identify different traffic
classes and associate properties to them.

A cost-effective way of providing time-related guarantees (i.e., throughput,
latency and jitter) is to use pipelined circuits in a TDMA (Time Division Multiple Access)
fashion, which is advantageous as it requires less buffer space compared to rate-based and
deadline-based schemes on systems on chip (SoC) which have tight synchronization.

At each slot, a data item is moved from one network component to the next
one, i.e. between routers or between a router and a network interface. Therefore, when a slot
is reserved at an output port, the next slot must be reserved on the following output port along
the path between a master and a slave module, and so on.

When multiple connections with timing guarantees are set up, the slot
allocation must be performed such that there are no clashes (i.e., there is no slot allocated to
more than one connection). The task of finding an optimum slot allocation for a given
network topology, i.e. a given number of routers and network interfaces, and a set of
connections between IP blocks is a highly computationally intensive problem (NP complete), as
it involves finding an optimal solution which requires exhaustive computation time.

It is therefore an object of the invention to provide an improved slot allocation 
in a network on chip environment. 

This object is achieved by an integrated circuit according to claim 1 and a
method for time slot allocation according to claim 16 as well as a data processing system
according to claim 17.

Therefore, an integrated circuit comprising a plurality of processing modules
and a network arranged for coupling said modules is provided. Said integrated circuit further
comprises a plurality of network interfaces each being coupled between one of said
processing modules and said network. Said network comprises a plurality of routers coupled
via network links to adjacent routers. Said processing modules communicate between each
other over connections using connection paths through the network, wherein each of said
connection paths employs at least one network link for a required number of time slots. At
least one time slot allocating unit is provided for computing a link weight factor for at least
one network link in said connection path as a function of at least one connection requirement
for said at least one network link, for computing a connection path weight factor for at least
one connection path as a function of the computed link weight factor of at least one network
link in said connection path, and for allocating time slots to said network links according to
the computed connection path weight factors.

Accordingly, a time slot allocation based on the actual connection requirement 
can be implemented. 

According to an aspect of the invention said connection requirements
comprise bandwidth, latency, jitter, priority and/or slot allocation requirements of the
connection path. The time slot allocation can be implemented and optimized according to one
of the specific connection requirements.

According to an aspect of the invention, said at least one time slot allocating
unit is adapted to allocate time slots to said network links in decreasing order of connection
path weight factor. Therefore, those network links requiring more time slots are considered
first during the time slot allocation, as these connections have more constraints and,
therefore, if left to the end, have less chance of finding free slots. By contrast, shorter
channels going through less utilized links have more freedom in finding slots, and can thus
be left toward the end of the slot allocation.

According to an aspect of the invention, said at least one time slot allocating
unit is adapted to compute said connection path weight factor based on said computed link
weight factors, the length of said connection path, and the bandwidth, latency, jitter, and/or
the number of time slots required for said connection path. Therefore, the length of the
connection path and the required amount of time slots may also be considered while
computing the connection path weight factor.

According to a further aspect of the invention, said at least one time slot
allocating unit is adapted to compute the connection path weight factor based on said
computed link weight factors, the length of said connection path, and the bandwidth, latency,
jitter, and/or the number of time slots required for said connection path weighted by a first,
second and third weight factor, respectively. The contribution of the length of the connection
path and the required bandwidth, latency, jitter, and/or time slots as well as the link weight
factors may be varied by adapting the respective weight factors.

According to an aspect of the invention, at least one time slot allocation unit is
arranged in at least one of said plurality of network interfaces and comprises a first time slot
table with entries specifying the connections to which time slots are allocated. Said routers can
also comprise second time slot tables with entries representing reservations of time slots
without specifying connections. The slot tables in the routers can be smaller, as the
information to which connection a time slot is associated does not need to be stored in these
slot tables. The information per slot can be smaller; however, the slot tables will probably end
up being larger overall, because there are multiple ports on a router (slot table size may be
#ports * #slots * information_per_slot).
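
By way of illustration only, the size relation stated above can be evaluated as in the following sketch. The function name and all numeric figures (5 ports, 16 slots, 64 connections) are assumptions chosen for the example and are not taken from the description.

```python
def slot_table_bits(num_ports: int, num_slots: int, bits_per_slot: int) -> int:
    """Slot table storage of one router: #ports * #slots * information_per_slot."""
    return num_ports * num_slots * bits_per_slot


# Hypothetical router: 5 output ports, slot tables of 16 entries.
# Reservation-only entries need a single bit per slot.
reservation_only_bits = slot_table_bits(num_ports=5, num_slots=16, bits_per_slot=1)

# Entries that name one of 64 hypothetical connections need 6 bits per slot.
connection_id_bits = (64 - 1).bit_length()
with_connection_bits = slot_table_bits(5, 16, connection_id_bits)

print(reservation_only_bits, with_connection_bits)
```

With these assumed figures the reservation-only tables are six times smaller than tables that name the connection per slot, while still scaling with the number of ports.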

According to an aspect of the invention, at least one time slot allocation unit is
arranged in at least one of said plurality of network interfaces and comprises a first time slot
table with entries specifying the connections to which time slots are allocated, and said routers
comprise second time slot tables with entries comprising information for routing data in said
network. Accordingly, as the routing information is stored in the routers, packet headers can
be omitted, leading to a higher throughput.

The invention also relates to a method for time slot allocation in an integrated
circuit having a plurality of processing modules, a network arranged for coupling said
modules and a plurality of network interfaces each being coupled between one of said
processing modules and said network. Said network comprises a plurality of routers coupled
via network links to adjacent routers. The communication between processing modules is
performed over connections using connection paths through the network, wherein each of said
connection paths employs at least one network link for a required number of time slots. A link
weight factor for at least one network link in said connection path is computed as a function of
connection requirements for said network links. A connection path weight factor for at least
one connection path is computed as a function of the computed link weight factors of at least
one network link in a connection path and connection requirements of the said connection
path. Time slots are allocated to said links according to the computed connection path weight
factors.

The invention also relates to a data processing system comprising a plurality
of processing modules and a network arranged for coupling said modules. Said integrated
circuit further comprises a plurality of network interfaces each being coupled between one of
said processing modules and said network. Said network comprises a plurality of routers
coupled via network links to adjacent routers. Said processing modules communicate
between each other over connections using connection paths through the network, wherein
each of said connection paths employs at least one network link for a required number of time
slots. At least one time slot allocating unit is provided for computing a link weight factor for
at least one network link in said connection path as a function of connection requirements for
said at least one network link, for computing a connection path weight factor for at least one
connection path as a function of the computed link weight factors of at least one network link
in said connection path and connection requirements of the said connection path, and for
allocating time slots to said network links according to the computed connection path weight
factors.

Accordingly, the time slot allocation may also be performed in a multi-chip 
network or a system or network with several separate integrated circuits. 

The invention is based on the idea to perform the slot allocations by
computing a link weight as a function of the bandwidth, latency, jitter, and/or numbers of
slots requested for each channel, i.e. each connection path, using the link and by computing a
channel weight as the sum of the link weights used by the channel.

Other aspects of the invention are defined in the dependent claims. 



The invention is now described in more detail with reference to the drawings.

Fig. 1 shows the basic structure of a network on chip according to the invention;
Fig. 2 shows a basic slot allocation for a connection in a network according to Fig. 1;
Fig. 3 shows a slot allocation in more detail in a network according to Fig. 1;
Fig. 4 shows a more detailed slot allocation according to a first embodiment;
Fig. 5 shows a more detailed slot allocation according to a second embodiment;
Fig. 6 shows a more detailed slot allocation according to a third embodiment;
Fig. 7 shows an illustration of a method for finding free slots;
Fig. 8 shows an alternative method for finding free slots;
Fig. 9 shows a network on chip with several connections;
Fig. 10 shows a network on chip according to Fig. 9 with computed link weights;
Fig. 11 shows a network on chip according to Fig. 10 with computed connection weights; and
Figs. 12 - 23 show a detailed slot allocation for the connections according to Figs. 9 - 11,
respectively.

DESCRIPTION OF THE PREFERRED EMBODIMENTS 

The following embodiments relate to systems on chip, i.e. a plurality of
modules on the same chip communicate with each other via some kind of interconnect. The
interconnect is embodied as a network on chip NOC. The network on chip may include
wires, bus, time-division multiplexing, switch, and/or routers within a network. At the
transport layer of said network, the communication between the modules is performed over
connections. A connection is considered as a set of channels, each having a set of connection
properties, between a first module and at least one second module. For a connection between
a first module and a single second module, the connection may comprise two channels,
namely one from the first module to the second module, i.e. the request channel, and a second
channel from the second to the first module, i.e. the response channel. The request channel is
reserved for data and messages from the first to the second module, while the response channel
is reserved for data and messages from the second to the first module. If no response is
required, the connection may only comprise one channel. However, if the connection
involves one first and N second modules, 2*N channels are provided. Therefore, a connection
or the path of the connection through the network, i.e. the connection path, comprises at least
one channel. In other words, a channel corresponds to the connection path of the connection
if only one channel is used. If two channels are used as mentioned above, one channel will
provide the connection path e.g. from the master to the slave, while the second channel will
provide the connection path from the slave to the master. Accordingly, for a typical
connection, the connection path will comprise two channels. The connection properties may
include ordering (data transport in order), flow control (a remote buffer is reserved for a
connection, and a data producer will be allowed to send data only when it is guaranteed that
space is available for the produced data), throughput (a lower bound on throughput is
guaranteed), latency (upper bound for latency is guaranteed), the lossiness (dropping of data),
transmission termination, transaction completion, data correctness, priority, or data delivery.

PHNL040352EPP 23.03.2004

Fig. 1 shows a network on chip according to the present invention. The system
comprises several so-called intellectual property blocks IPs (computation elements, memories
or a subsystem which may internally contain interconnect modules) which are each
connected to a network N via a network interface NI, respectively. The network N comprises
a plurality of routers R, which are connected to adjacent routers R via respective links.

The network interfaces NI are used as interfaces between the IP blocks and the
network N. The network interfaces NI are provided to manage the communication of the
respective IP blocks and the network N, so that the IP blocks can perform their dedicated
operation without having to deal with the communication with the network N or other IP
blocks. The IP blocks may act as masters, i.e. initiating a request, or may act as slaves, i.e.
receiving a request from a master and processing the request accordingly.

Fig. 2 shows a block diagram of a connection and a basic slot allocation in a
network on chip according to Fig. 1. In particular, the connection between a master M and a
slave S is shown. This connection is realized by a network interface NI associated to the
master M, two routers, and a network interface NI associated to a slave S. The network
interface NI associated to the master M comprises a time slot allocation unit SA.
Alternatively, the network interface NI associated to the slave S may also comprise a time
slot allocation unit SA. A first link L1 is present between the network interface NI associated
to the master M and a first router R, a second link L2 is present between the two routers R,
and a third link L3 is present between a router and the network interface NI associated to the
slave S. Three slot tables ST1 - ST3 for the output ports of the respective network
components are also shown. These slot tables are preferably implemented on the output side,
i.e. the data producing side, of the network elements like network interfaces and routers. For
each requested slot, one slot is reserved in each slot table of the links along the connection
path. All these slots must be free, i.e., not reserved by other channels. Since the data advance
from one network component to another each slot, starting from slot s=1, the next slot along
the connection must be reserved at slot s=2 and then at slot s=3.

The inputs for the slot allocation determination performed by the time slot
allocation unit SA are the network topology, like network components, with their
interconnection, and the slot table size, and the connection set. For every connection, its paths
and its bandwidth, latency, jitter, and/or slot requirements are given. A connection consists of
at least two channels or connection paths (a request channel from master to slave, and a
response channel from slave to master). Each of these channels is set on an individual path,
and may comprise different links having different bandwidth, latency, jitter, and/or slot
requirements. To provide time related guarantees, slots must be reserved for the links.
Different slots can be reserved for different connections by means of TDMA. Data for a
connection is then transferred over consecutive links along the connection in consecutive
slots.
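
The inputs listed above may, purely as an illustrative sketch, be captured in a small data model such as the following. The class names, field names and the validation rules are assumptions made for this example; the description does not prescribe any particular representation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Channel:
    name: str
    path: tuple  # links traversed, in order, from producer to consumer
    slots: int   # number of time slots requested for this channel


@dataclass(frozen=True)
class Connection:
    request: Channel   # master to slave
    response: Channel  # slave to master


@dataclass
class AllocationProblem:
    links: set             # network topology, reduced here to its link names
    slot_table_size: int   # equal for all slot tables
    connections: list = field(default_factory=list)

    def channels(self):
        """All channels of all connections, request first."""
        return [ch for c in self.connections for ch in (c.request, c.response)]

    def validate(self):
        """Reject channels that use unknown links or request too many slots."""
        for ch in self.channels():
            if any(link not in self.links for link in ch.path):
                raise ValueError(f"{ch.name}: path uses a link outside the topology")
            if not 0 < ch.slots <= self.slot_table_size:
                raise ValueError(f"{ch.name}: slot request exceeds the slot table")


# Hypothetical example: one connection over four unidirectional links.
problem = AllocationProblem(
    links={"L1", "L2", "L3", "L4"},
    slot_table_size=8,
    connections=[Connection(Channel("c1_req", ("L1", "L2"), 2),
                            Channel("c1_resp", ("L3", "L4"), 1))],
)
problem.validate()
```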

Fig. 3 shows a table implementation of Fig. 2 in more detail. Here, two
network interfaces NI1, NI2 and two routers R1, R2 and the three links L1 - L3 between the
network interface NI1 and the router R1, between the router R1 and the router R2, and
between the router R2 and the network interface NI2 are shown, respectively. The IP blocks
are not shown. The slot tables ST1 - ST3 are shown for each of the labeled links L1 - L3.
These links are bi-directional, and, hence, for each link there is a slot table for each of the
two directions; the slot tables ST1 - ST3 are only shown for one direction. Additionally,
three connections c1 - c3 are depicted. In addition to the above three slot tables ST1 - ST3,
further slot tables ST4 - ST6 are shown. Now all slot tables ST1 - ST6 are shown which are
related to the three connections c1 - c3. The first connection c1 extends from the network
interface NI1 to the network interface NI2 via the routers R1 and R2. The second connection
c2 extends from the network interface NI1 to the router R1 and then to a further network
component (not shown) using slot table ST4. The third connection c3 may originate from a
network component which is not shown and passes from the router R1 to the router R2 and
further to another network component which is not shown, using slot table ST6. The
connection c1 reserves one slot in each of the three links L1 - L3 it uses (NI1 to R1, R1 to R2,
and R2 to NI2). The slots in these links must be consecutive (slot 2, slot 3, and slot 4,
respectively). From a router point of view, in a time slot, the router receives data from input
links, on the connections c1 - c3 those links L1 - L3 are reserved for. The data is stored in the
router. At the same time, the router sends the data it has received in the previous slot to
output links. According to this model, as the data is stored in a router for at most one slot,
the slots of a connection must be reserved consecutively.

A possible generalization or alternative of the slot allocation problem would
be to allow data to be buffered in the routers for more than one slot duration. As a result, slot
allocation becomes more flexible, which could lead to better link utilization, at the expense of
more buffering, and potentially longer latencies.

Slots must be reserved such that there are no conflicts on links. That is, there
are no two connections that reserve the same slot of the same link. For example, c1 reserves
slot 2 for the link between NI1 and R1. Consequently, c2 cannot use slot 2 for the same link.
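
The two validity conditions described above (consecutive slots along the path, and no slot of a link reserved twice) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `reserve`, the link name L4, the table size of 8 and the wrap-around of slot numbers at the end of the table are all assumptions made for the example.

```python
def reserve(slot_tables, path, start_slot, connection):
    """Reserve one slot per link of `path`, advancing one slot per hop.

    Returns False, leaving the tables untouched, if any required slot is taken.
    Slot numbers are assumed to wrap around at the end of the table.
    """
    size = len(slot_tables[path[0]])
    wanted = [(link, (start_slot + hop) % size) for hop, link in enumerate(path)]
    if any(slot_tables[link][slot] is not None for link, slot in wanted):
        return False
    for link, slot in wanted:
        slot_tables[link][slot] = connection
    return True


# Hypothetical tables of size 8 for four links; None marks a free slot.
tables = {link: [None] * 8 for link in ("L1", "L2", "L3", "L4")}

ok_c1 = reserve(tables, ["L1", "L2", "L3"], 2, "c1")  # slots 2, 3, 4 as in Fig. 3
clash = reserve(tables, ["L1", "L4"], 2, "c2")        # slot 2 of L1 is taken by c1
ok_c2 = reserve(tables, ["L1", "L4"], 3, "c2")        # the next slot is still free
```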

The problem of finding a valid slot allocation (i.e., with consecutive slots, and
conflict free) which is optimal (i.e., uses the minimum number of slots) is NP complete.

Fig. 4 shows a straightforward slot table implementation according to a first
embodiment by implementing for each of the first, second and third links L1 - L3 a table
which specifies which slots are reserved for which connection. In particular, only those slot
tables ST1 - ST3 are shown, which are required by the three connections c1 - c3 for the three
links L1 - L3. A preferred place to store this table is in the router/network interface
producing data for that link, i.e. the output port, because the router/network interface has to
know whether a link is reserved or not, in order to produce data for that link. The table may
also be part of the time slot allocation unit SA.

Fig. 5 shows a more efficient slot-allocation
encoding according to a second embodiment. Here, also only those slot tables ST1 - ST3 are
shown, which are required by the three connections c1 - c3 for the three links L1 - L3. The
information to which connection a slot belongs is stored in the network interface NI and in
particular in the time slot allocation unit SA, while the slot tables ST1 - ST3 in the routers only
mark if a slot is reserved or not for the links. The routers need not know the connections
associated with slots, as they only move data from one network element to another and
finally to the correct output based on the packet headers (containing a destination address or a
path to destination).

In Fig. 6, a possible variation according to a third embodiment to the above
encoding of Fig. 4 and Fig. 5 is shown. Here, the routing information is stored in the router
itself (instead of the packet header). In output port slot tables ST1 - ST3, slots indicate from
which input data is consumed. In this way, the packet header can be omitted, leading to more
throughput, at the expense of larger slot tables in the routers.

Now the actual slot allocation function is described, which may be
implemented in the time slot allocation unit SA. The result leads to a slot allocation which
corresponds to the slot requirements. For each link in the path of the connection, a weight is
computed as a function of the bandwidth, latency, jitter, priority and/or number of slots
requested for each channel ch_i in the connection path that uses that link:
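
The three slot table encodings of Figs. 4 to 6 can be contrasted with the following sketch. It is an illustration only: the table contents, the input port names and the helper function names are assumptions and are not taken from the figures.

```python
# First embodiment (Fig. 4): each link's table names the connection per slot.
# The contents below are hypothetical; None marks a free slot.
link_tables = {
    "L1": [None, "c2", "c1", None],
    "L2": [None, None, "c3", "c1"],
}


def reservation_only(table):
    """Second embodiment (Fig. 5): a router only marks whether a slot is reserved."""
    return [entry is not None for entry in table]


def input_port_table(table, input_port_of):
    """Third embodiment (Fig. 6): each output-port slot names the input to read from."""
    return [None if entry is None else input_port_of[entry] for entry in table]


# Hypothetical mapping from connection to the router input port it arrives on.
arrives_on = {"c1": "in0", "c3": "in1"}

router_l2_bits = reservation_only(link_tables["L2"])
router_l2_inputs = input_port_table(link_tables["L2"], arrives_on)
```

In the second encoding the connection names remain only in the network interface, whereas in the third encoding the router table itself carries enough information to forward data without a packet header.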

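$$\mathrm{weight}(link) = f\big(\mathrm{bandwidth}(ch_i),\ \mathrm{latency}(ch_i),\ \mathrm{jitter}(ch_i),\ \mathrm{priority}(ch_i),\ \mathrm{slots}(ch_i)\big), \quad \forall\, ch_i \text{ such that } link \in ch_i$$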

Alternatively, for each link in the at least one channel in the connection path a
weight is computed as a sum of the number of slots requested or required for each connection
path, i.e. each channel, that uses that link:

$$\mathrm{weight}(link) = \sum_{channel \,:\, link \in \mathrm{path}(channel)} \mathrm{slots}(channel)$$

Then for each channel in a connection path, a weight is computed as a function
(e.g., the sum) of the weights of the links in the channel path (as part of the connection path),
and possibly other properties of the channel (e.g., bandwidth, latency, priority):

$$\mathrm{weight}(ch) = f\big(\mathrm{weight}(link_j),\ \mathrm{bandwidth}(ch),\ \mathrm{latency}(ch),\ \mathrm{priority}(ch)\big), \quad \forall\, link_j \in ch$$


Or, alternatively, for each channel (i.e. each connection path), a weight is
computed as the sum of the weights of the links in the channel path:

$$\mathrm{weight}(channel) = \sum_{link \in \mathrm{path}(channel)} \mathrm{weight}(link)$$

For the above alternative simple functions,

$$\mathrm{weight}(link) = \sum_{channel \,:\, link \in \mathrm{path}(channel)} \mathrm{slots}(channel)$$

$$\mathrm{weight}(channel) = \sum_{link \in \mathrm{path}(channel)} \mathrm{weight}(link)$$

this algorithm may be implemented by the following pseudo code:
- Compute link weights
FOR all channels C DO
    FOR all links L ∈ path(C) DO
        weight[L] += slots[C]
    END FOR
END FOR

- Compute channel weights
FOR all channels C DO
    FOR all links L ∈ path(C) DO
        weight[C] += weight[L]
    END FOR
END FOR

- Perform slot allocation
BEGIN
    FOR all channels C sorted decreasingly by weight(C) DO
        Find slots[C] free slots and allocate them to C
    END FOR
END



Slots are allocated to the channels in the decreasing order of their calculated
weights. For each requested slot, there is one slot reserved in each slot table of the links along
the channel path. All these slots must be free, i.e. not reserved previously by other channels.
These slots may be allocated in a trial-and-error fashion: starting from a particular slot, a
number of slots are checked until a free one is found in all of the links along the path.
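
A minimal runnable sketch of the procedure described above is given below, assuming the simple sum-based weight functions. It is not the only possible implementation: the function names, the example channels, the greedy search starting at slot 0 and the wrap-around of slot numbers at the end of the table are assumptions made for the illustration.

```python
from collections import defaultdict


def link_weights(channels):
    """weight(link) = sum of slots(channel) over all channels using the link."""
    weights = defaultdict(int)
    for path, slots in channels.values():
        for link in path:
            weights[link] += slots
    return dict(weights)


def channel_weights(channels, lw):
    """weight(channel) = sum of weight(link) over the links of its path."""
    return {name: sum(lw[link] for link in path)
            for name, (path, _slots) in channels.items()}


def allocate(channels, table_size):
    """Allocate slots to channels in decreasing order of channel weight."""
    lw = link_weights(channels)
    cw = channel_weights(channels, lw)
    tables = {link: [None] * table_size for link in lw}
    allocation = {}
    for name in sorted(channels, key=lambda n: cw[n], reverse=True):
        path, needed = channels[name]
        starts = []
        for start in range(table_size):  # trial and error from slot 0
            if len(starts) == needed:
                break
            # Consecutive slots on consecutive links (wrap-around assumed).
            hops = [(link, (start + h) % table_size) for h, link in enumerate(path)]
            if all(tables[link][slot] is None for link, slot in hops):
                for link, slot in hops:
                    tables[link][slot] = name
                starts.append(start)
        if len(starts) < needed:
            raise ValueError(f"no free slots left for channel {name}")
        allocation[name] = starts
    return allocation, tables


# Hypothetical channels: name -> (path as list of links, requested slots).
channels = {
    "c1": (["L1", "L2", "L3"], 1),
    "c2": (["L1", "L4"], 2),
    "c3": (["L5", "L2", "L6"], 1),
}
allocation, tables = allocate(channels, table_size=4)
print(allocation)  # {'c1': [0], 'c2': [1, 2], 'c3': [1]}
```

In this example c1 has the largest weight (6) and is placed first; c2 (weight 5) and c3 (weight 4) then take the earliest start slots that are free on every link of their paths.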

Slots can be tried for allocation using different policies. Examples are
consecutive slots, or evenly distributed slots. The reason multiple policies are needed is that
different properties can be optimized with different policies. For example, consecutive slots
may reduce header overhead, while evenly distributed slots may reduce latency.
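
The two example policies can be compared with the following sketch, which only generates candidate slot positions and ignores which slots are actually free. The function names and the use of the largest gap between reserved slots as a rough indicator of waiting time are assumptions made for the illustration.

```python
def consecutive_slots(table_size, needed, first=0):
    """Candidate positions for the consecutive policy, wrapping at the table end."""
    return [(first + i) % table_size for i in range(needed)]


def evenly_distributed_slots(table_size, needed, first=0):
    """Candidate positions spread as evenly as possible over the slot table."""
    return [(first + (i * table_size) // needed) % table_size for i in range(needed)]


def max_gap(slots, table_size):
    """Largest distance, in slots, from one reserved slot to the next (cyclic)."""
    ordered = sorted(slots)
    gaps = []
    for i, current in enumerate(ordered):
        following = ordered[(i + 1) % len(ordered)]
        # A single reserved slot recurs only after a full table revolution.
        gaps.append(((following - current) % table_size) or table_size)
    return max(gaps)


# Hypothetical request: 4 slots in a table of 16 slots.
packed = consecutive_slots(16, 4)         # [0, 1, 2, 3]
spread = evenly_distributed_slots(16, 4)  # [0, 4, 8, 12]
print(max_gap(packed, 16), max_gap(spread, 16))  # 13 4
```

With consecutive slots, data arriving just after the last reserved slot may wait up to 13 slots in this example, whereas with evenly distributed slots the wait is bounded by 4 slots.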

The proposed technique has a low complexity of O(C x L x S), where C is the
number of channels, L is the number of links, and S is the slot table size. The slot allocations
obtained with this algorithm are comparable to the optimum (obtained at a much higher
complexity: O(S°)), and a factor of 2 better than a greedy algorithm (i.e., with a random order
for channel allocation).

An alternative example algorithm is now described. Again, for each link, a
weight is computed as the sum of the number of slots requested for each channel that uses
that link:

$$\mathrm{weight}(link) = \sum_{channel \,:\, link \in \mathrm{path}(channel)} \mathrm{slots}(channel)$$

Then for each channel, a weight is computed from the sum of the weights of the
links in the channel path, the length of the channel path, and the number of slots requested for
the channel:

$$\mathrm{weight}(channel) = a_1 \times \sum_{link \in \mathrm{path}(channel)} \mathrm{weight}(link) + a_2 \times \mathrm{length}(channel) + a_3 \times \mathrm{slots}(channel)$$

where a_1, a_2, and a_3 are constants (this is an example of weight formulas, but others are
also possible).
This example algorithm may be implemented by the following pseudo code:

- Compute link weights
FOR all channels C DO
    FOR all links L ∈ path(C) DO
        weight[L] += slots[C]
    END FOR
END FOR

- Compute channel weights
FOR all channels C DO
    FOR all links L ∈ path(C) DO
        weight[C] += weight[L]
    END FOR
    weight[C] = a1 x weight[C] + a2 x length[C] + a3 x slots[C]
END FOR

- Perform slot allocation
BEGIN
    FOR all channels C sorted decreasingly by weight(C) DO
        Find slots[C] free slots and allocate them to C
    END FOR
END

The computation of the link weights according to the second embodiment is as
described in the first code, but the channel weights are calculated differently. The idea
behind this channel weight formula is to start the scheduling with the channels requiring
more slots, the channels passing through frequently used links, i.e. going through hot spots
(links with a high load, and, hence, a large number of slots to be reserved), and the
channels having long paths, by giving them a higher weight such that they are scheduled
first. The reason is that these connections have more constraints, and, therefore, if left to
the end, have less chance of finding free slots. In contrast, shorter channels going through
less utilized links have more freedom in finding slots, and can thus be left toward the end
of the slot allocation.
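
The link and channel weight computation described above can be sketched in Python as
follows. This is only an illustrative sketch: the function names, the three-channel example
network and the choice of measuring length(channel) as the number of links in the path are
assumptions made here, not details taken from the description.

```python
from collections import defaultdict


def link_weights(paths, slots):
    """Sum, per directed link, the slots requested by every channel using it."""
    weight = defaultdict(int)
    for channel, path in paths.items():
        for link in path:
            weight[link] += slots[channel]
    return dict(weight)


def channel_weights(paths, slots, a1=1, a2=0, a3=0):
    """a1 * (sum of link weights) + a2 * (path length) + a3 * (requested slots)."""
    lw = link_weights(paths, slots)
    return {
        channel: a1 * sum(lw[link] for link in path)
        + a2 * len(path)          # assumption: length = number of links
        + a3 * slots[channel]
        for channel, path in paths.items()
    }


def allocation_order(paths, slots, **coefficients):
    """Channels sorted by decreasing weight (ties keep their input order)."""
    cw = channel_weights(paths, slots, **coefficients)
    return sorted(paths, key=lambda channel: cw[channel], reverse=True)


# Hypothetical example: three channels, each path a list of directed links.
paths = {
    "A": [("n1", "r1"), ("r1", "r2"), ("r2", "n2")],
    "B": [("n1", "r1"), ("r1", "n3")],
    "C": [("n4", "r2"), ("r2", "n2")],
}
slots = {"A": 3, "B": 1, "C": 2}

print(channel_weights(paths, slots))   # weights with a1 = 1, a2 = a3 = 0
print(allocation_order(paths, slots))  # scheduling order
```

Channel A crosses the two most heavily loaded links and has the longest path, so it is
scheduled first; channel B, which shares only one loaded link, is left to the end.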

Slots may be allocated to the channels (i.e. each connection path) in the
decreasing order of their computed weights. For each requested slot, there is one slot
reserved in each slot table of the links along the channel path as shown in Fig. 2. All these
slots must be free, i.e., not reserved previously by other channels (i.e. each connection
path). These slots are allocated in a trial-and-error fashion: starting from a particular
slot, slots are checked until the required number of slots are found free in all links along
the path. An example algorithm trace is presented in the following section.

Fig. 7 shows a method of finding free time slots. Here, as an example, a slot
table of size 16 is depicted. The slot finding process can be performed in various ways. One
example is to find slots in a greedy way, i.e., the first N free slots. Another example is to
find slots equally distanced in order to minimize buffer space. This can be done by finding a
first free slot ffs, then computing positions that are equally distanced in the slot table,
and then searching locally around the computed positions to find the nearest free position.
The slots that are already reserved are marked with a cross in Fig. 7. The first free slot
ffs is slot 2. As there are 16 slots, the ideal positions (2nd position ip2, 3rd position ip3)
for the other two slots would be slots 7 and 13 (to get an equal distance between them),
respectively. As slot 7 is already reserved, a free slot is searched for in the neighborhood
of slot 7. The nearest free slot found is slot 5. As slot 13 is free, it can be reserved as
well. Consequently, the three slots reserved are 2, 5, and 13 and are marked with a black
ball.
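
The equally distanced search just described can be sketched as follows. The sketch makes
several assumptions that the description does not fix: slots are numbered from 0, ideal
positions are rounded to the nearest integer, the lower-numbered neighbour is preferred when
two free slots are equally near, and the set of reserved slots is a hypothetical one chosen
so that the outcome matches the example (the actual contents of Fig. 7 are not reproduced
here).

```python
def find_evenly_spaced_slots(reserved, table_size, needed):
    """Pick `needed` free slots spread as evenly as possible, or None."""
    free = [s for s in range(table_size) if s not in reserved]
    if len(free) < needed:
        return None

    first = free[0]                      # first free slot (ffs)
    chosen = [first]
    for i in range(1, needed):
        # Ideal position: i-th multiple of the even spacing after the first slot.
        ideal = round(first + i * table_size / needed) % table_size
        # Search outward from the ideal position for the nearest usable slot.
        for distance in range(table_size):
            candidates = [(ideal - distance) % table_size,
                          (ideal + distance) % table_size]
            hit = next((c for c in candidates
                        if c not in reserved and c not in chosen), None)
            if hit is not None:
                chosen.append(hit)
                break
    return sorted(chosen)


# Hypothetical reservation pattern for a 16-slot table.
reserved = {0, 1, 6, 7, 8, 9}
print(find_evenly_spaced_slots(reserved, table_size=16, needed=3))
```

With this reservation pattern the ideal positions are 7 and 13; slot 7 and its direct
neighbours are taken, so slot 5 is chosen instead, giving slots 2, 5 and 13.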

It should be noted that free slots for a connection are those that are free along
the complete path, i.e. consecutive time slots should be free in consecutive links along the
path. Therefore, all slot tables along a connection path must be checked to find a free slot
for a particular connection. A simple way of searching for free slots for a connection is to
start from the first link of the connection, and try all subsequent slot tables along the
path, skipping those reserved. To minimize the searching time, one may also start from the
most loaded link.

Fig. 8 shows another technique to speed up the searching of free slots for the
case where only the slot reservation is stored (using 1 bit) as described in Fig. 5. It is
based on checking multiple slots in parallel. This can be performed both in hardware (a unit
to check any fixed number of bits, i.e. the time slot allocation unit SA), and in software
(CPU data words can store e.g. 16 or 32 slot reservations simultaneously). On the left hand
side of Fig. 8, slot tables 1st for links L1 to L4 are shown as an example. On the right hand
side, free slot words fsw, which are used to determine the free slots along the path, are
shown. Free slots are found by traversing the slot tables and filtering the reserved ones,
and shifting (>> 1) the searched slots by one position at each link (corresponding to the
required slot alignment). Firstly, the first link L1 of the path is chosen, which comprises
reserved slots 0, 1, 6, 9, 11, 12, and 14. These slots are marked as reserved, e.g. by an
'X' in the free slot word. Thereafter, the free slot word fsw is shifted by one position to
the right to reflect the slot alignment, and OR-ed, i.e. an OR operation is performed, to add
the reserved slots of the second link (slots 3, 6, 10, and 12). These steps are repeated for
the third and fourth links L3, L4, which results in a vector of free slots in all the links
along the path. To align it to the first link L1, it is shifted to the left by three
positions. The result is that slots 4, 8, 10, and 13 of the first link L1 are free in all
links L1 - L4 along the given path. In particular, slot 4 is free for link L1, while slots 5,
6, and 7 are free for links L2, L3 and L4, respectively. Slot 8 is free in link L1, while
slots 9, 10, and 11 are free for links L2, L3 and L4, respectively. In addition, slot 10 is
free in link L1, while slots 11, 12, and 13 are free for links L2, L3 and L4, respectively.
Slot 13 is free in link L1, while slots 14, 15, and 1 are free for links L2, L3 and L4,
respectively.
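
A software version of this parallel check can be sketched with one integer per slot table,
where bit i is set when slot i is reserved. The reserved slots of L1 and L2 below are those
given above; the tables for L3 and L4 are hypothetical values chosen only to be consistent
with the stated result, and the cyclic wrap-around of the slot table is an assumption. In the
figure slot 0 is drawn on the left, so the "shift to the right" corresponds to moving bit i
to bit i + 1 in this encoding.

```python
def to_mask(reserved_slots):
    """Encode a collection of reserved slot numbers as a bit mask."""
    mask = 0
    for s in reserved_slots:
        mask |= 1 << s
    return mask


def rotate_up(word, size):
    """Move every bit i to bit (i + 1) mod size (one link further along)."""
    full = (1 << size) - 1
    return ((word << 1) | (word >> (size - 1))) & full


def rotate_down(word, k, size):
    """Move every bit i to bit (i - k) mod size (re-align to the first link)."""
    k %= size
    full = (1 << size) - 1
    return ((word >> k) | (word << (size - k))) & full


def free_slots_on_path(reserved_per_link, size):
    """Slots of the first link that are free, with alignment, on every link."""
    acc = 0
    for hop, reserved in enumerate(reserved_per_link):
        if hop > 0:
            acc = rotate_up(acc, size)   # slot s here is slot s + 1 on the next link
        acc |= to_mask(reserved)         # OR in this link's reservations
    acc = rotate_down(acc, len(reserved_per_link) - 1, size)
    return [s for s in range(size) if not (acc >> s) & 1]


l1 = [0, 1, 6, 9, 11, 12, 14]   # from the description
l2 = [3, 6, 10, 12]             # from the description
l3 = [1, 5, 9]                  # hypothetical
l4 = [2]                        # hypothetical
print(free_slots_on_path([l1, l2, l3, l4], size=16))
```

All 16 slots of a link are tested with a single OR per link, instead of one comparison per
slot and link.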

Fig. 9 shows an example of a network on chip consisting of 4 routers R1 - R4
and 7 network interfaces NI (NI1 - NI7). The IP blocks to which the network interfaces are
connected are not shown. As an example, 12 connections C1 - C12 are selected. These
connections are used to transport data between the IP modules (not shown) attached to the
network interfaces NIs, and, therefore, the connections are always set up between two network
interfaces NIs. For the sake of simplicity, we assume that all connections are unidirectional
(consist of one channel), although in practice bi-directional connections (two channels) may
also exist. For example, connection C1 starts at NI1, and goes through R1 and R4 to reach
NI6. Similarly, connection C2 goes through NI1, R1, R2, R3, and NI4. Connection C3 goes
through NI1, R1, R2 and NI2. Connection C4 goes through NI2, R2, R3, R4, and NI7.
Connection C5 goes through NI2, R2, and NI3. Connection C6 passes through NI3, R2, R1,
and NI1. Connection C7 passes through NI3, R2, R3, and NI5. Connection C8 passes through
NI4, R3, and NI5. Connection C9 passes through NI5, R3, R4, R1, and NI1. Connection C10
passes through NI6, R4, R3, R2, and NI2. Connection C11 passes through NI7, R4, R3, R2,
and NI3. Connection C12 passes through NI7, R4, R1, and NI1. For each connection, a
number of slots must be reserved. These numbers are listed on the right side of Fig. 9, i.e.
connections C1-C12 require 1, 1, 5, 2, 4, 3, 2, 6, 1, 2, 1, and 2 slots, respectively.

Fig. 10 shows a network on chip according to Fig. 9. The algorithm starts by
computing the link weights. This is done by summing for each link the number of slots
requested for all the connections that use that link. This is performed separately for each
direction. The links are encircled and the link weight result is written next to each of
them. For example, for the link from R2 to R3 the slots requested by connections C2, C4 and
C7 are added. This results in a link weight of 1 (C2) + 2 (C4) + 2 (C7) = 5. The link from
NI1 to R1 requires 7 slots, the link from R1 to NI1 requires 6 slots, etc.

Fig. 11 shows a network on chip according to Fig. 10. The algorithm computes
the connection weights. In the connection weight formula, a1 = 1, a2 = 0, and a3 = 0 (i.e.,
the sum of the link weights). Alternatively, different values may be selected. The resulting
connection weights are shown on the right hand side of Fig. 11. For example, the weight of
connection C1 is the sum of the weights of links NI1 to R1, R1 to R4, and R4 to NI6, which
is 7 + 1 + 1 = 9. The connections are then sorted decreasingly with regard to the computed
weight factor, and scheduled in that order.
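
The link and connection weights of this example can be recomputed from the routes and slot
counts listed for Fig. 9. The sketch below is a plain transcription of that data with a1 = 1
and a2 = a3 = 0; the variable names are illustrative only, and the way ties between equal
weights are broken (here: in the order C1 to C12) is an assumption that need not match the
figures.

```python
# Slot requests of connections C1..C12, as listed for Fig. 9.
SLOTS = dict(zip([f"C{i}" for i in range(1, 13)],
                 [1, 1, 5, 2, 4, 3, 2, 6, 1, 2, 1, 2]))

# Routes of the connections, as listed for Fig. 9.
ROUTES = {
    "C1": ["NI1", "R1", "R4", "NI6"],
    "C2": ["NI1", "R1", "R2", "R3", "NI4"],
    "C3": ["NI1", "R1", "R2", "NI2"],
    "C4": ["NI2", "R2", "R3", "R4", "NI7"],
    "C5": ["NI2", "R2", "NI3"],
    "C6": ["NI3", "R2", "R1", "NI1"],
    "C7": ["NI3", "R2", "R3", "NI5"],
    "C8": ["NI4", "R3", "NI5"],
    "C9": ["NI5", "R3", "R4", "R1", "NI1"],
    "C10": ["NI6", "R4", "R3", "R2", "NI2"],
    "C11": ["NI7", "R4", "R3", "R2", "NI3"],
    "C12": ["NI7", "R4", "R1", "NI1"],
}


def links(route):
    """Directed links (from, to) traversed by a route."""
    return list(zip(route, route[1:]))


# Link weight: slots requested by all connections using the link (per direction).
link_weight = {}
for conn, route in ROUTES.items():
    for link in links(route):
        link_weight[link] = link_weight.get(link, 0) + SLOTS[conn]

# Connection weight with a1 = 1, a2 = a3 = 0: sum of the link weights on the path.
conn_weight = {conn: sum(link_weight[l] for l in links(route))
               for conn, route in ROUTES.items()}

order = sorted(ROUTES, key=conn_weight.get, reverse=True)

print(link_weight[("R2", "R3")], link_weight[("NI1", "R1")], link_weight[("R1", "NI1")])
print(conn_weight["C1"])
print(order[0], order[-1])
```

The printed link weights are the values 5, 7 and 6 quoted for Fig. 10, and the weight of C1
is the value 9 quoted for Fig. 11. C3 receives the largest weight and C1 the smallest.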

In Figures 12 to 23, the time slot allocation for all connections C1 - C12 is
shown. It is assumed that [...]; for example, the slot tables have a size of 9 slots. The
depicted numbers in the slot tables correspond to the respective connections C1 - C12 to
which these slots have been allocated.

Figs. 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23 show the allocation of the
connections C3, C2, C7, C4, C11, C10, C6, C8, C9, C12, C5, and C1, respectively.

According to Fig. 12, for connection C3 requiring 5 slots, all slots are free,
and, hence, slots 1 to 5 of link NI1 to R1, slots 2-6 of the link R1 to R2, and slots 3-7 of
the link R2 to NI2 are allocated to it. According to Fig. 13, for connection C2 requiring one
slot, the first 5 slots are already reserved in the first link, and, hence, it reserves slots
6, 7, 8 and 9 in the respective slot tables along the path. According to Fig. 14, connection
C7 requiring 2 slots has no conflicts in the first two slots, and, therefore, allocates them.
According to Fig. 15, connection C4 requiring 2 slots can only reserve slots 3 and 4, as the
first two are reserved for C7 in the second link (R2 to R3). According to Fig. 16, connection
C11 again has no conflicts, and reserves the first slot in the slot table of the link between
network interface NI7 and router R4 as well as the consecutive slots in the slot tables of
the other links.

In the case of connection C10 requiring 2 slots according to Fig. 17, however,
the first 4 slots conflict with the slots reserved for C3 in the link R2 to NI2, so the
first free slots are 5 and 6. As shown in Fig. 18, connection C6 allocates three slots,
namely slots 3-5, in the link of network interface NI3 and router R2; slots 4-6 in the slot
table of the link of router R2 and router R1; and slots 5-7 in the slot table of the link of
router R1 and network interface NI1. According to Fig. 19, connection C8 allocates 6 slots,
namely slots 1, 4-8, in the link of network interface NI4 and router R3; and slots 2, 5-9 in
the slot table of the link of router R3 and network interface NI5, as the slots 3-4 in the
slot table of the link of router R3 and the network interface NI5 are already allocated or
reserved to connection C7.

As shown in Fig. 20, connection C9 allocates one slot, namely slot 1, in the
link of network interface NI5 and router R3; slot 2 in the slot table of the link of router
R3 and router R4; slot 3 in the slot table of the link of router R4 and router R1; and slot
4 in the slot table of the link of router R1 and network interface NI1. According to Fig. 21,
connection C12 allocates two slots, namely slots 6-7 in the link of network interface NI7
and router R4; slots 7-8 in the link of router R4 and router R1; and slots 8-9 in the slot
table of the link of router R1 and network interface NI1, as slot 4 and slots 5-7 in the
slot table of the link of router R1 and the network interface NI1 are already allocated to
connections C9 and C6, respectively.

As depicted in Fig. 22, connection C5 allocates 4 slots, namely slots 1-2 and
5-6, in the slot table of the link of network interface NI2 and router R2; and slots 2-3 and
6-7 in the slot table of the link of router R2 and network interface NI3, as slots 3-4 in
the slot table of the link of network interface NI2 and router R2 are already allocated to
connection C4.

Finally, in Fig. 23 connection C1 allocates one slot, namely slot 7, in the slot
table of the link of network interface NI1 and router R1; slot 8 in the slot table of the
link of router R1 and router R4; and slot 9 in the slot table of the link of router R4 and
network interface NI6. Accordingly, the end result of the slot allocation is shown in Fig. 23.
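
The trace of Figs. 12 to 23 can be replayed with a short first-fit simulation. The sketch
below takes the routes and slot counts listed for Fig. 9, the scheduling order listed for
Figs. 12 to 23 and a slot table size of 9. It assumes that slots are numbered from 1, that a
slot s on one link continues as slot s + 1 on the next link, and that the slot table wraps
around cyclically; the function and variable names are illustrative and not part of the
description.

```python
TABLE_SIZE = 9
SLOTS = dict(zip([f"C{i}" for i in range(1, 13)],
                 [1, 1, 5, 2, 4, 3, 2, 6, 1, 2, 1, 2]))
ROUTES = {
    "C1": ["NI1", "R1", "R4", "NI6"],
    "C2": ["NI1", "R1", "R2", "R3", "NI4"],
    "C3": ["NI1", "R1", "R2", "NI2"],
    "C4": ["NI2", "R2", "R3", "R4", "NI7"],
    "C5": ["NI2", "R2", "NI3"],
    "C6": ["NI3", "R2", "R1", "NI1"],
    "C7": ["NI3", "R2", "R3", "NI5"],
    "C8": ["NI4", "R3", "NI5"],
    "C9": ["NI5", "R3", "R4", "R1", "NI1"],
    "C10": ["NI6", "R4", "R3", "R2", "NI2"],
    "C11": ["NI7", "R4", "R3", "R2", "NI3"],
    "C12": ["NI7", "R4", "R1", "NI1"],
}
# Scheduling order as listed for Figs. 12 to 23.
ORDER = ["C3", "C2", "C7", "C4", "C11", "C10", "C6", "C8", "C9", "C12", "C5", "C1"]


def allocate(routes, slots, order, table_size):
    """First-fit allocation; returns (first-link slots per connection, slot tables)."""
    tables = {}   # (from, to) -> {slot number: connection}
    result = {}
    for conn in order:
        path = list(zip(routes[conn], routes[conn][1:]))
        granted = []
        for start in range(1, table_size + 1):
            if len(granted) == slots[conn]:
                break
            # Slot used on hop k of the path, aligned to `start` on the first link.
            hops = [(link, (start - 1 + k) % table_size + 1)
                    for k, link in enumerate(path)]
            if all(slot not in tables.get(link, {}) for link, slot in hops):
                for link, slot in hops:
                    tables.setdefault(link, {})[slot] = conn
                granted.append(start)
        if len(granted) < slots[conn]:
            raise RuntimeError(f"not enough free slots for {conn}")
        result[conn] = granted
    return result, tables


result, tables = allocate(ROUTES, SLOTS, ORDER, TABLE_SIZE)
for conn in ORDER:
    print(conn, result[conn])
```

The printed slot numbers refer to the first link of each connection. Under the stated
assumptions they agree with the allocations described for Figs. 12 to 23, for example slots
1 and 4-8 for C8 and slot 7 for C1.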

Although in the above, the time slot allocation unit is described as being
arranged in the network interfaces, the time slot allocation unit may also be arranged in the
routers within the network.

The above described time slot allocation can be applied to any data processing
device comprising several separated integrated circuits or multi-chip networks, not only to a
network on a single chip.

It should be noted that the above-mentioned embodiments illustrate rather than
limit the invention, and that those skilled in the art will be able to design many
alternative embodiments without departing from the scope of the appended claims. In the
claims, any reference signs placed between parentheses shall not be construed as limiting the
claim. The word "comprising" does not exclude the presence of elements or steps other than
those listed in a claim. The word "a" or "an" preceding an element does not exclude the
presence of a plurality of such elements. In the device claim enumerating several means,
several of these means can be embodied by one and the same item of hardware. The mere fact
that certain measures are recited in mutually different dependent claims does not indicate
that a combination of these measures cannot be used to advantage.

Furthermore, any reference signs in the claims shall not be construed as
limiting the scope of the claims.

CLAIMS:



1. Integrated circuit comprising a plurality of processing modules (M, S; IP) and
a network (N) arranged for coupling said modules (M, S; IP), comprising
a plurality of network interfaces (NI) each being coupled between one of said
processing modules (M, S; IP) and said network (N);
wherein said network (N) comprises a plurality of routers (R) coupled via network links (L)
to adjacent routers (R);
wherein said processing modules (M, S; IP) communicate between each other over
connections using connection paths (C1-C12) through the network (N), wherein each of said
connection paths (C1-C12) employ at least one network link (L) for a required number of
time slots,
at least one time slot allocating unit (SA) for computing a link weight factor
for at least one network link (L) in said connection path (C1-C12) as a function of at least
one connection requirement for said at least one network link (L), for computing a connection
path weight factor for at least one connection path (C1-C12) as a function of the computed
link weight factor of at least one network link (L) in said connection path (C1-C12), and for
allocating time slots to said network links (L) according to the computed connection path
weight factors.

2. Integrated circuit according to claim 1, wherein said at least one time slot
allocating unit (SA) is further adapted to compute a connection path weight factor for at
least one connection path (C1-C12) as a function of said at least one connection requirement
of the said connection path.

3. Integrated circuit according to claim 1 or 2, wherein said connection
requirements comprise bandwidth, latency, jitter, priority and/or slot allocation
requirements of the connection path (C1-C12).

4. Integrated circuit according to claim 2 or 3, wherein said at least one time slot
allocating unit (SA) is adapted to compute said function for computing link weights as a
weighted sum of bandwidth and/or slot table requirements for said at least one network link
(L).

5. Integrated circuit according to claim 3 or 4, wherein said at least one time slot
allocating unit (SA) is adapted to compute said function for computing connection path
weights as a weighted sum of the computed link weight factors of at least one network link
(L) in said connection path (C1-C12) and the bandwidth, latency, jitter, priority and/or slot
allocation requirements of the said connection path (C1-C12).

6. Integrated circuit according to claim 1 or 3, wherein said at least one time slot
allocating unit (SA) is adapted to allocate time slots to said at least one network link (L)
in decreasing order of connection path weight factor.

7. Integrated circuit according to claim 1 or 3, wherein said at least one time slot
allocating unit (SA) is adapted to compute said connection path weight factors based on said
computed link weight factors, the length of said connection path (C1-C12), and the number
of time slots required for said connection path (C1-C12).

8. Integrated circuit according to claim 7, wherein said at least one time slot
allocating unit (SA) is adapted to compute said connection path weight factors based on said
computed link weight factors, the length of said connection path (C1-C12), and the number
of time slots required for said connection path weighted by a first, second and third weight
factor (a1, a2, a3), respectively.

9. Integrated circuit according to claim 7, wherein at least one time slot
allocating unit (SA) is arranged in at least one of said plurality of network interfaces (NI)
and comprises a first time slot table (ST) with entries specifying connections to which time
slots are allocated, and said routers (R) comprise second time slot tables (ST) with entries
representing reservations of time slots without specifying connections.

10. Integrated circuit according to claim 9, wherein said routers (R) move data
arranged in packets with packet headers according to said packet headers.

11. Integrated circuit according to claim 1 or 2, wherein at least one time slot
allocating unit (SA) is arranged in at least one of said plurality of network interfaces (NI)
and comprises a first time slot table (ST) with entries specifying connections to which time
slots are allocated, and said routers (R) comprise second time slot tables (ST) with entries
comprising information for routing data in said network (N).

12. Integrated circuit according to claim 9 or 11, wherein said time slot allocating
unit (SA) is adapted to find the first free time slots in said first and second slot tables
(ST1, ST2) along said connection paths (C1-C12) according to the required time slots of said
connections (C1-C12).

13. Integrated circuit according to claim 9 or 11, wherein said time slot allocating
unit (SA) is adapted to find the required free time slots in said first and second time slot
tables (ST1, ST2) for said connections (C1-C12) by finding at least a first free time slot in
one of said first and second slot tables (ST1, ST2), by computing positions which are equally
distanced in the slot table, and by searching locally around the computed positions to find
the nearest free time slot.

14. Integrated circuit according to claim 13, wherein the search for free time slots
is started from the most loaded network link (L).

15. Integrated circuit according to claim 9 or 11, wherein said time slot allocating
unit (SA) is adapted to find the required free time slots in said first and second time slot
tables (ST1, ST2) for said connections (C1-C12) by traversing said slot tables (ST1, ST2), by
filtering the reserved time slots, and by shifting the searched time slots with one position
for each network link (L) in said connection (C1-C12).

16. Method for time slot allocation in an integrated circuit comprising a plurality
of processing modules (M, S; IP) and a network (N) arranged for coupling said modules (M,
S; IP), and a plurality of network interfaces (NI) each being coupled between one of said
processing modules (M, S; IP) and said network (N) comprising a plurality of routers (R)
coupled via network links (L) to adjacent routers (R); comprising the steps of:
communicating between processing modules (M, S; IP) over connections
using connection paths (C1-C12) through the network (N), wherein each of said connection
paths (C1-C12) employ at least one network link (L) for a required number of time slots,
computing a link weight factor for at least one network link (L) in said
connection path (C1-C12) as a function of at least one connection requirement for said
network links (L),
computing a connection path weight factor for at least one connection path
(C1-C12) as a function of the computed link weight factor of at least one network link (L) in
a connection path (C1-C12), and
allocating time slots to said network links (L) according to the computed
connection path weight factors.

17. Data processing system comprising:
a plurality of processing modules (M, S; IP) and a network (N) arranged for
coupling said modules (M, S; IP), comprising:
a plurality of network interfaces (NI) each being coupled between one of said
processing modules (M, S; IP) and said network (N);
wherein said network (N) comprises a plurality of routers (R) coupled via network links (L)
to adjacent routers (R);
wherein said processing modules (M, S; IP) communicate between each other over
connections using connection paths (C1-C12) through the network (N), wherein each of said
connection paths (C1-C12) employ at least one network link (L) for a required number of
time slots,
at least one time slot allocating unit (SA) for computing a link weight factor
for at least one network link (L) in said connection path (C1-C12) as a function of at least
one connection requirement for said at least one network link (L), for computing a connection
path weight factor for at least one connection path (C1-C12) as a function of the computed
link weight factor of at least one network link (L) in said connection path (C1-C12), and for
allocating time slots to said network links (L) according to the computed connection path
weight factors.

ABSTRACT:



An integrated circuit comprising a plurality of processing modules (M, S; IP)
and a network (N) arranged for coupling said modules (M, S; IP) is provided. Said integrated
circuit further comprises a plurality of network interfaces (NI) each being coupled between
one of said processing modules (M, S; IP) and said network (N). Said network (N) comprises
a plurality of routers (R) coupled via network links (L) to adjacent routers (R). Said
processing modules (M, S; IP) communicate between each other over connections using
connection paths (C1-C12) through the network (N), wherein each of said connection paths
(C1-C12) employ at least one network link (L) for a required number of time slots. At least
one time slot allocating unit (SA) is provided for computing a link weight factor for at
least one network link (L) in said connection path (C1-C12) as a function of at least one
connection requirement for said at least one network link (L), for computing a connection
path weight factor for at least one connection path (C1-C12) as a function of the computed
link weight factor of at least one network link (L) in said connection path (C1-C12), and for
allocating time slots to said network links (L) according to the computed connection path
weight factors.